The inhomogeneous evolution of subgraphs and cycles in complex networks 
Subgraphs and cycles are often used to characterize the local properties of complex networks. 
Here we show that the subgraph structure of real networks is highly time dependent: as the network 
grows, the density of some subgraphs remains unchanged, while the density of others increases at 
a rate that is determined by the network's degree distribution and clustering properties. This 
inhomogeneous evolution process, supported by direct measurements on several real networks, leads 
to systematic shifts in the overall subgraph spectrum and to an inevitable overrepresentation of 
some subgraphs and cycles. 
Subgraphs, representing a subset of connected vertices
in a graph, provide important information about the
structure of many real networks. For example, in cellular
regulatory networks feed-forward loops play a key role
in processing regulatory information [1], while in protein
interaction networks highly connected subgraphs represent
evolutionary conserved groups of proteins [2]. In
a similar vein, cycles, a special class of subgraphs, offer
evidence for autonomous behavior in ecosystems [3],
cyclical exchanges give stability to social structures [4],
and cycles contribute to reader orientation in hypertext
[5]. Finally, understanding the nature and frequency of
cycles is important for uncovering the equilibrium properties
of various network models.

Motivated by these practical and theoretical questions,
a series of statistical tools has recently been introduced
to evaluate the abundance of subgraphs [1, 2] and
cycles, offering a better description of a
network's local organization. Yet, most of these methods
were designed to capture the subgraph structure of
a specific snapshot of a network, characterizing static
graphs. Most real networks, however, are the result
of a growth process, and continue to evolve in time.
While growth often leaves some of the network's
global features unchanged, it does alter its local, sub- 
graph based structure, potentially modifying everything 
from subgraph densities to cycle abundance. Yet, the 
currently available statistical methods cannot anticipate 
or describe such potential changes. 

In this paper we show that during growth the subgraph 
structure of complex networks undergoes a systematic re- 
organization. We find that the evolution of the relative 
subgraph and cycle abundance can be predicted from the 
degree distribution $P(k)$ and the degree dependent average
clustering coefficient $C(k)$. The results indicate that
the subgraph composition of complex networks changes 
in a very inhomogeneous manner: while the density of 
many subgraphs is independent of the network size, they 
coexist with a class of subgraphs whose density increases
at a subgraph dependent rate as the network expands.
Therefore in the thermodynamic limit a few subgraphs
will be highly overrepresented, a prediction that is
supported by direct measurements on a number of real
networks for which time resolved network topologies are
available. This finding questions our ability to characterize
networks based on the subgraph abundance obtained
from a single topological snapshot. We show that a combined
understanding of network evolution and subgraph
abundance offers a more complete picture.

FIG. 1: Examples of subgraphs and cycles with a central
vertex. The subgraph shown in (a) has $n = 5$ vertices and
$n - 1 + t = 5$ edges, where $t = 1$ represents the number of edges
connecting the neighbors of the central vertex (empty circle)
together. In (b) we show a subgraph with $t = 3$ edges among
the neighbors, such that the central vertex and its neighbors
form a cycle of length $h = 5$, highlighted by the dotted circle.

Subgraphs: We consider subgraphs with $n$ vertices and
$n - 1 + t$ edges, whose central vertex has links to $n - 1$
neighbors, which in turn have $t$ links among themselves
(Fig. 1). The total number of $n$-node subgraphs that
can pass through a node with degree $k$ is $\binom{k}{n-1}$. Each of these
$n$-node subgraphs can have at most $n_p = (n - 1)(n - 2)/2$
edges between the $n - 1$ neighbors of the central node.
The probability that there is an edge between two neighbors
of a degree $k$ vertex is given by the clustering coefficient
$C(k)$. Therefore, the probability to obtain $t$ connected
pairs and $n_p - t$ disconnected pairs is given by the
binomial distribution of $n_p$ trials with probability $C(k)$.

TABLE I: Characteristic exponents of the investigated real
networks and the deterministic model. The exponents are
defined through the scaling of the degree distribution $P(k) \sim k^{-\gamma}$,
the clustering coefficient $C(k) = C_0 k^{-\alpha}$, with $C_0 \sim N^{\theta}$,
the largest degree $k_{\max} \sim N^{\delta}$, and the number of $h$-cycles
$N_h/N \sim N^{\zeta_h}$.

The expected number of $(n, t)$ subgraphs in the network
is obtained after averaging over the degree distribution,
resulting in

$$N_{nt} = g_{nt}\, N \sum_{k=n-1}^{k_{\max}} P(k) \binom{k}{n-1} \binom{n_p}{t}\, C(k)^{t} \left[1 - C(k)\right]^{n_p - t}, \qquad (1)$$

where $k_{\max}$ is the maximum degree and the geometric
factor $g_{nt}$ takes into account that the same subgraph can
have more than one central vertex. For instance, a triangle
will be counted three times since each vertex is
connected to the others, therefore $g_{31} = 1/3$. For networks
where $P(k) \sim k^{-\gamma}$ and $C(k) \sim k^{-\alpha}$, where $\gamma$ and
$\alpha$ are the degree distribution and clustering hierarchy exponents,
in the thermodynamic limit $k_{\max} \to \infty$ Eq. (1)
predicts the existence of two subgraph classes

$$\frac{N_{nt}}{N} \sim \begin{cases} C_0^{t}\, k_{\max}^{\,n - \gamma - \alpha t}, & n - \gamma - \alpha t > 0, \quad \text{Type I}, \\ C_0^{t}, & n - \gamma - \alpha t < 0, \quad \text{Type II}. \end{cases} \qquad (2)$$
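
The degree sum described above can also be evaluated numerically. The following is a minimal sketch, assuming a pure power-law degree distribution $P(k) \propto k^{-\gamma}$ normalized on $1 \le k \le k_{\max}$ and a clustering coefficient $C(k) = C_0 k^{-\alpha}$ clipped at 1; the function names and the clipping are illustrative choices, not part of the original analysis. The classification uses only the sign of $n - \gamma - \alpha t$.

```python
from math import comb


def subgraph_density(n, t, gamma, alpha, c0, k_max, g_nt=1.0):
    """Numerically evaluate N_nt / N for centrally connected (n, t) subgraphs.

    Assumes P(k) ~ k^-gamma on 1..k_max and C(k) = c0 * k^-alpha (clipped at 1).
    """
    n_p = (n - 1) * (n - 2) // 2  # possible edges among the n - 1 neighbors
    norm = sum(k ** -gamma for k in range(1, k_max + 1))  # normalization of P(k)
    total = 0.0
    for k in range(n - 1, k_max + 1):  # a central vertex needs n - 1 neighbors
        p_k = k ** -gamma / norm
        c_k = min(1.0, c0 * k ** -alpha)  # keep C(k) a valid probability
        total += (p_k * comb(k, n - 1) * comb(n_p, t)
                  * c_k ** t * (1.0 - c_k) ** (n_p - t))
    return g_nt * total


def subgraph_type(n, t, gamma, alpha):
    """Classify an (n, t) subgraph by the sign of n - gamma - alpha * t."""
    exponent = n - gamma - alpha * t
    if exponent > 0:
        return "Type I"
    if exponent < 0:
        return "Type II"
    return "marginal"
```

With hypothetical values such as $\gamma = 2.5$ and $\alpha = 0.9$, calling `subgraph_density(5, 1, 2.5, 0.9, 1.0, k_max)` for increasing `k_max` shows the growth expected when $n - \gamma - \alpha t > 0$, while a case with a negative exponent saturates.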

Therefore, for the Type I subgraphs the density $N_{nt}/N$
increases with increasing network size, while $N_{nt}/N$ is
independent of $N$ for Type II subgraphs. In the following
we provide direct evidence for the two subgraph types in
several real networks for which varying network sizes are
available: the co-authorship network of mathematical
publications [13], the autonomous system representation of
the Internet [14, 15], and the semantic web of English
synonyms [16]. In each of these networks the maximum
degree increases as $k_{\max} \sim N^{\delta}$. We estimated $\delta$ from
the scaling of the degree distribution moments with the
graph size, $\langle k^n \rangle \sim N^{\delta(n+1-\gamma)}$, with $n = 2, 3, 4$. Furthermore,
we find that $C_0$ from $C(k) = C_0 k^{-\alpha}$ also depends
on the network size as $C_0 \sim N^{\theta}$, where $\theta$ can be estimated
using $C_0 = \sum_{k>2} C(k) / \sum_{k>2} k^{-\alpha}$, giving a better
estimate than a direct fit of $C(k)$. The exponents
characterizing each network are summarized in Table I.
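
The ratio-of-sums estimate of $C_0$ mentioned above can be sketched as follows. This is a minimal illustration, assuming the measured clustering is available as a mapping from degree to $C(k)$ and that $\alpha$ is already known; the function name `estimate_c0` and the input format are hypothetical.

```python
def estimate_c0(clustering_by_degree, alpha):
    """Estimate C_0 as sum_{k>2} C(k) / sum_{k>2} k^-alpha.

    clustering_by_degree: dict mapping degree k to the measured C(k).
    alpha: clustering exponent, assumed to be known beforehand.
    """
    degrees = [k for k in clustering_by_degree if k > 2]
    numerator = sum(clustering_by_degree[k] for k in degrees)
    denominator = sum(k ** -alpha for k in degrees)
    return numerator / denominator
```

Repeating the estimate on snapshots of different size $N$ and fitting $\log C_0$ against $\log N$ would then give $\theta$, under the assumption that $C_0 \sim N^{\theta}$ holds over the sampled range.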

In Fig. 2 we show the density of all five vertex subgraphs
($n = 5$) as a function of $t$. For the Internet and
language networks $C_0$ increases with $N$, therefore the
subgraph density increases with the network size for all
subgraphs. This consequence of the non-stationarity of
the clustering coefficient is subtracted by normalizing $N_{nt}$
by $C_0^{t}$. For the co-authorship graph with $\alpha = 0$ (Table I),
only Type I subgraphs are observed, as predicted
by Eq. (2). In contrast, for the Internet and semantic
networks $\alpha > 0$, therefore the overrepresented Type I phase
is expected to end approximately at the phase boundary
predicted by Eq. (2). Indeed, to the left of the arrow denoting
the $n - \gamma - \alpha t = 0$ phase boundary we continue to observe a
systematic increase in $N_{nt}/(N C_0^{t})$, as expected for Type I
subgraphs. In contrast, beyond the phase boundary the
subgraph densities obtained for different network sizes
are independent of $N$, collapsing into a single curve.

FIG. 2: Number of $(n = 5, t)$ subgraphs for the co-authorship
(a), Internet (b), semantic (c) networks and the deterministic
model (d) as a function of $t$. Different symbols correspond to
different snapshots of the network's evolution, from early stage
(circles) to intermediate (squares) and current (i.e. largest)
(triangles). $N_{5t}$ depends strongly on $t$ (spanning several orders
of magnitude), making it difficult to observe the $N$ dependence.
Thus we normalized all the quantities ($N_{5t}$, $C_0$ and
$N$) to the first year available. The arrows correspond to the
phase boundary $5 - \gamma - \alpha t = 0$, with Type I and II subgraphs
to the left and right of the arrow, respectively. In the insets,
to show the system size dependence, we plot $\log N_{5t}$ vs $\log N$ for
different values of $t$.

We compared our predictions with direct counts in a
growing deterministic network model [17] as well, characterized
by a degree exponent $\gamma = 1 + \ln 3/\ln 2 \approx 2.6$ and
a degree dependent clustering coefficient $C(k) = C_0 k^{-\alpha}$,
with $C_0 = 2$ and $\alpha = 1$. In Fig. 2(d) we show the number
of $(n = 5, t)$ subgraphs for different values of $t$ and graph
sizes. The arrow indicating the predicted phase transition
point $n - \gamma - \alpha t = 0$ clearly separates the Type I from
the Type II subgraphs, a numerical finding that is supported
by exact calculations as well. Note that only one
Type II $n = 5$ subgraph is present in the deterministic
network, due to its particular evolution rule.
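
As a worked example of the phase boundary for the deterministic model, the condition $n - \gamma - \alpha t = 0$ can be solved for $t$ with the quoted values $\gamma = 1 + \ln 3/\ln 2$, $\alpha = 1$ and $n = 5$. The sketch below is illustrative only (the names are hypothetical); it classifies every $t$ from 0 to $n_p = 6$ by the sign of the exponent and does not check which of these subgraphs the model's evolution rule actually generates.

```python
from math import log


def phase_boundary_t(n, gamma, alpha):
    """Return the value of t at which n - gamma - alpha * t changes sign."""
    return (n - gamma) / alpha


# Parameters quoted in the text for the deterministic model.
gamma_det = 1 + log(3) / log(2)   # approximately 2.585
alpha_det = 1.0
n = 5

t_star = phase_boundary_t(n, gamma_det, alpha_det)   # approximately 2.415

# Sign of the exponent for each possible number of neighbor-neighbor edges.
types_by_t = {
    t: "Type I" if n - gamma_det - alpha_det * t > 0 else "Type II"
    for t in range(0, 7)
}
```

Under these values the boundary falls between $t = 2$ and $t = 3$, so subgraphs with $t \le 2$ are classified as Type I and those with $t \ge 3$ as Type II.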

Cycles: The formalism developed above can be generalized
to predict cycle abundance as well. Consider the
set of centrally connected cycles shown in Fig. 1(b). If the
central vertex has degree $k$, we can form $\binom{k}{h-1}$ different
groups of $h$ vertices, $h - 1$ selected from its $k$ neighbors
and the central vertex. Each ordering of the $h - 1$ selected
neighbors corresponds to a different cycle, therefore we
multiply by half of the number of their permutations,
$(h - 1)!/2$ (assuming that 123 is the same as 321). Finally,
to obtain the number of $h$-cycles we multiply the result
by the probability of having $h - 2$ edges between consecutive
neighbors, $C(k)^{h-2}$, and sum over the degree
distribution $P(k)$, finding

$$\frac{N_h}{N} = g_h \sum_{k=h-1}^{k_{\max}} P(k) \binom{k}{h-1} \frac{(h-1)!}{2}\, C(k)^{h-2}, \qquad (3)$$

where $g_h$ is again a geometric factor correcting multiple
counting of the same cycle. Note that Eq. (3) represents a
lower bound for the total number of $h$-cycles, which also
include cycles without a central vertex. Depending on
the values of $h$, $\gamma$ and $\alpha$ the sum in Eq. (3) may converge
or diverge in the limit $k_{\max} \to \infty$. When it converges,
the density of $h$-cycles is independent of $N$ (Type II),
otherwise it grows with $N$ (Type I). Since in preferential
attachment models without clustering the density of
$h$-cycles decreases with increasing $N$ [18], we conclude that
clustering is the essential feature that gives rise to the
observed high $h$-cycle number in real networks such as
the Internet. To further characterize the cycle spectrum,
we need to distinguish two different cases, $0 < \alpha < 1$
and $\alpha > 1$.

$0 < \alpha < 1$: In the $k_{\max} \to \infty$ limit the cycle density
follows

$$\frac{N_h}{N} \sim \begin{cases} C_0^{h-2}, & h < h_c, \\ C_0^{h-2}\, k_{\max}^{(1-\alpha)(h-h_c)}, & h > h_c, \end{cases} \qquad (4)$$

where $h_c = (\gamma - 2\alpha)/(1 - \alpha)$. Therefore, large cycles ($h >
h_c$) are abundant, their density growing with the network
size $N$. As $\alpha \to 1$ the threshold $h_c \to \infty$, therefore
the range of $h$ for which the density is size independent
expands significantly.

FIG. 3: Number of $h$-cycles as computed from Eq. (3), using $\gamma =
2.5$, (a) $C_0 = 1$ and $\alpha = 0.9$, (b) $C_0 = 2$ and $\alpha = 1.1$, and
$k_{\max} = 500$ (dashed-dotted), 700 (dashed) and 900 (solid).
(c) $h$ value at which $N_h$ has a maximum as a function of
$k_{\max}$.
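
The threshold $h_c$ and the resulting $k_{\max}$ exponent follow from collecting the powers of $k$ in the degree sum: the summand scales as $k^{h-1-\gamma-\alpha(h-2)}$, so the sum scales as $k_{\max}^{(1-\alpha)(h-h_c)}$ whenever that exponent is positive. A minimal sketch of this arithmetic, with hypothetical function names and valid only for $0 < \alpha < 1$:

```python
def critical_cycle_length(gamma, alpha):
    """Return h_c = (gamma - 2*alpha) / (1 - alpha), defined for 0 < alpha < 1."""
    if not 0 < alpha < 1:
        raise ValueError("h_c is only defined here for 0 < alpha < 1")
    return (gamma - 2 * alpha) / (1 - alpha)


def kmax_exponent(h, gamma, alpha):
    """Exponent of k_max in N_h / N: zero below h_c, (1 - alpha)(h - h_c) above."""
    h_c = critical_cycle_length(gamma, alpha)
    return max(0.0, (1 - alpha) * (h - h_c))
```

For example, with $\gamma = 2.5$ and $\alpha = 0.9$ this gives $h_c = 7$, so in this illustration only cycles longer than seven vertices have a density that grows with $k_{\max}$.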

Direct calculations using Eq. (3) show that $N_h$ exhibits a
maximum at some intermediate value of $h$ (see Fig. 3),
already reported for the deterministic model. The
maximum represents a finite size effect, as the characteristic
cycle length $h^*$, corresponding to the maximum
of $N_h$, scales as $h^* \sim k_{\max}$ [Fig. 3(c)]. Yet, next we
show that this behavior is not generic, but depends on
the value of $\alpha$.
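
The location of the maximum can be explored by evaluating the degree sum for centrally connected cycles directly. The sketch below is one possible way to do this, assuming $P(k) \propto k^{-\gamma}$ normalized on $1 \le k \le k_{\max}$ and $C(k) = C_0 k^{-\alpha}$; it works in log space because the number of orderings $\binom{k}{h-1}(h-1)!/2$ overflows floating point for large $h$. The function names are illustrative.

```python
from math import exp, lgamma, log


def log_cycle_density(h, gamma, alpha, c0, k_max, g_h=1.0):
    """Return log(N_h / N) for centrally connected h-cycles.

    Assumes P(k) ~ k^-gamma on 1..k_max and C(k) = c0 * k^-alpha.
    """
    log_norm = log(sum(k ** -gamma for k in range(1, k_max + 1)))
    terms = []
    for k in range(h - 1, k_max + 1):
        log_p = -gamma * log(k) - log_norm
        # log of binom(k, h-1) * (h-1)! / 2 = log(k! / (k-h+1)!) - log 2
        log_orderings = lgamma(k + 1) - lgamma(k - h + 2) - log(2)
        # log of C(k)^(h-2)
        log_edges = (h - 2) * (log(c0) - alpha * log(k))
        terms.append(log_p + log_orderings + log_edges)
    top = max(terms)  # log-sum-exp to avoid overflow
    return log(g_h) + top + log(sum(exp(x - top) for x in terms))


def most_abundant_cycle_length(gamma, alpha, c0, k_max):
    """Return the cycle length h (3 <= h <= k_max + 1) maximizing N_h."""
    return max(range(3, k_max + 2),
               key=lambda h: log_cycle_density(h, gamma, alpha, c0, k_max))
```

Calling `most_abundant_cycle_length(2.5, 0.9, 1.0, k_max)` for a few values of `k_max` gives a quick check of how the characteristic length $h^*$ moves with the maximum degree under these assumptions.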

$\alpha > 1$: For all $\gamma > 2$ only Type II subgraphs are expected
($N_h/N \sim C_0^{h-2}$), as suggested by the divergence
of $h_c$ in the $\alpha \to 1$ limit. If $C_0 > 1$ the number of $h$-cycles
continues to exhibit a maximum and the characteristic
cycle length $h^*$ scales as $h^* \sim k_{\max}$. If $C_0 < 1$, however,
the number of $h$-cycles decreases with $h$, although a small
local minimum is seen for small cycles. More importantly, in
this case $N_h/N$ is independent of the network size (see
Fig. 3), in contrast with the size dependence observed
earlier [Fig. 3(a) and 3(b)]. Thus, for networks with $\alpha > 1$
or $\alpha = 1$ and $C_0 < 1$ the cycle spectrum is stationary,
independent of the stage of the growth process in which
we inspect the network.

Our predictions for the cycle abundance are based on
centrally connected cycles, in which a central vertex is
connected to all vertices of the cycle [Fig. 1(b)]. In the
following we show that our predictions capture the scaling
of all $h$-cycles as well, not only those that are centrally
connected. For this in Fig. 4 we plot the number
of $h = 3, 4, 5$ cycles (i.e. all cycles as well as those that
are centrally connected) as a function of the graph size
for the studied real and model networks, together with
our predictions (continuous line). First we note that in
many cases ($h = 3$ and 4) the full cycle density and the
density of the centrally connected cycles overlap. In the
few cases ($h = 5$) where there are systematic differences
between the two densities the $N$-dependence of the two
quantities is the same, indicating that our calculations
correctly predict the scaling of all cycles.

FIG. 4: Density of all (open symbols) and centrally connected
(filled symbols) cycles with $h = 3$ (circles), 4 (squares) and 5
(diamonds) as a function of the graph size. The continuous
lines correspond to our predictions (Table I).

For the co-authorship and Internet graphs $\alpha < 1$ and
$h_c < 3$, therefore the $h = 3, 4, 5$ cycles are predicted to
be in the Type I regime ($h > h_c$). In this case $N_h/N \sim
N^{\zeta_h}$, where $\zeta_h = \theta(h - 2) + \delta(1 - \alpha)(h - h_c)$. For the
language graph $\alpha = 1$, therefore $\zeta_h = \theta(h - 2)$. For the
deterministic model a direct count of the $h$-cycles reveals
that they are of Type II, i.e. their density is independent
of $N$ [10], in agreement with our predictions for $\alpha > 1$.
These predictions are shown as continuous lines in Fig. 4,
indicating a good agreement with the real measurements.

Our results offer evidence of quite complex subgraph
dynamics. As the network grows, the density of the
Type II subgraphs remains unchanged, being independent
of the system size. In contrast, the density of the
Type I subgraphs increases in an inhomogeneous fashion.
Indeed, each $(n, t)$ subgraph has its own growth
exponent $\zeta_{nt}$, which means that their density increases
in a differentiated manner: the density of some Type I
subgraphs will grow faster than the density of the other
Type I subgraphs. Thus, inspecting the system at several
time intervals one expects significant shifts in subgraph
densities. As a group, with increasing network
size the Type I graphs will significantly outnumber the
constant density Type II graphs. Therefore the inspection
of the graph density at a given moment will offer us
valuable, but limited information about the overall local
structure of a complex network. However, $P(k)$ and the
$C(k)$ functions allow us to predict with high precision
the future shifts in subgraph densities, indicating that a
precise knowledge of the global network characteristics
is needed to fully understand the local structure of the
network at any moment. These results will eventually
force us to reevaluate a number of concepts, ranging from
the potential characterization of complex networks based
on their subgraph spectrum to our understanding of the
impact of subgraphs on processes taking place on complex
networks [19, 20].